NEIGHBORWATCHER: A Content-Agnostic Comment Spam Inference System

Authors

  • Jialong Zhang
  • Guofei Gu
Abstract

Comment spam has become a popular means for spammers to attract direct visits to target websites or to manipulate their search rankings. By posting a small number of spam messages on each victim website (e.g., normal websites such as forums, wikis, guestbooks, and blogs, which we term spam harbors in this paper) while spamming across a large variety of harbors, spammers can not only inherit some reputation from these harbors but also evade the content-based detection systems deployed on them. To find suitable harbors, spammers have their own preferred methods, based on their available resources and the cost involved (e.g., ease of automatic posting, chance of content sanitization on the website). As a result, each spammer builds a relatively stable set of harbors that have proved easy and friendly to post spam on, which we refer to as their spamming infrastructure. Our measurement also shows that different spammers typically have different spamming infrastructures, although sometimes with some overlap. This paper presents NEIGHBORWATCHER, a comment spam inference system that exploits information about spammers’ spamming infrastructure to infer comment spam. At its core, NEIGHBORWATCHER runs a graph-based algorithm to characterize the spamming neighbor relationship, and reports a spam link when the same link also appears in the harbor’s clique neighbors. Starting from a small seed set of known spam links, our system inferred roughly 91,636 comment spam and 16,694 spam harbors that are frequently used by comment spammers. Furthermore, our evaluation on real-world data shows that NEIGHBORWATCHER can keep inferring new comment spam and finding new spam harbors every day.
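
The neighbor-based inference described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the neighbor rule (two harbors are neighbors when they share at least one seed spam link), the use of maximal cliques, the function names, and the `min_neighbors` threshold are all assumptions made for illustration only.

```python
from itertools import combinations


def build_neighbor_graph(seed_postings):
    """Build an undirected harbor graph from seed data.

    seed_postings maps each harbor to the set of known spam links seen on it.
    Assumption: two harbors are neighbors if they share at least one seed link.
    """
    graph = {harbor: set() for harbor in seed_postings}
    for a, b in combinations(seed_postings, 2):
        if seed_postings[a] & seed_postings[b]:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def maximal_cliques(graph):
    """Enumerate maximal cliques of size >= 2 (Bron-Kerbosch, no pivoting)."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            if len(r) > 1:
                cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(graph), set())
    return cliques


def infer_spam(new_postings, cliques, min_neighbors=1):
    """Flag (harbor, link) pairs whose link also appears on clique neighbors.

    min_neighbors is a hypothetical threshold: the number of other harbors,
    in a clique with this harbor, that must carry the same link.
    """
    flagged = set()
    for harbor, links in new_postings.items():
        clique_neighbors = set()
        for clique in cliques:
            if harbor in clique:
                clique_neighbors |= clique - {harbor}
        for link in links:
            hits = sum(
                1 for n in clique_neighbors if link in new_postings.get(n, ())
            )
            if hits >= min_neighbors:
                flagged.add((harbor, link))
    return flagged
```

The sketch never inspects the text of a comment, which mirrors the content-agnostic idea: a link is flagged purely because it reappears on harbors that were previously spammed together.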


Similar resources

CABD: A Content Agnostic Botnet Detection System

A botnet is a network of compromised hosts controlled by a single entity, called the botmaster. These compromised hosts can be utilized for malicious activities such as Distributed Denial of Service (DDoS) attacks, spam, and information extraction such as the extraction of user authentication via key-logging, each of which nets profits to the botmaster. Research in the detection of botnets is ex...


A Self-Supervised Approach to Comment Spam Detection Based on Content Analysis

This paper studies the problems and threats posed by a type of spam in the blogosphere, called blog comment spam. It explores the challenges introduced by comment spam, generalizing the analysis substantially to any other short text type spam. The authors analyze different high-level features of spam and legitimate comments based on the content of blog postings. The authors use these features t...


Detecting Comment Spam through Content Analysis

In the Web 2.0 era, individual Internet users can also act as information providers, releasing information or making comments conveniently. However, some participants may spread irresponsible remarks or post irrelevant comments for commercial interests. This kind of so-called comment spam severely hurts information quality. This paper tries to automatically detect comment spam throug...


Library blogs and user participation: a survey about comment spam in library blogs

Purpose: The purpose of this research is to identify and describe the impact of comment spam in library blogs. Three research questions guided the study: current level of commenting in library blogs; librarians' perception of comment spam; and techniques used to address the comment spam problem. Design/methodology/approach: A quantitative approach is used to investigate research questions. Inform...


Poster: Effort-based Detection of Comment Spammers

Social media has become ubiquitous and important for content sharing. A typical example of how users contribute content to a social media platform is through comment threads in online articles. Unfortunately, there is an increasing prevalence of malicious activity in these threads by spammers through comment messages. The existing approaches tackling comment spam are comment-level in that they ...



Publication date: 2013